Scientific Progress

The acquired equipment is used to engineer, develop, and evaluate systems that use mathematical and
computational tools to help us interact efficiently with the world around us or in our imagination. To this end, our
research team investigates the use of artificial intelligence and visual computing. Numerous fields across the human-computer
interaction and gaming research areas, from modeling to visualization, rely on the foundations within this cross-section of
computing sciences. The projects for which the equipment is utilized serve as a potentially unique bridge at the intersection of
two domains. On the one hand, a significant amount of research has been invested in digital gaming and simulation to
cognitively stimulate humans by computers, resulting in a $10.5 billion entertainment industry [1]. On the other hand, cognitive
computing scientists and roboticists are engaged in developing computational models that enable computers and robots
to understand physical and cyber environments efficiently.

We  believe  that  connecting  these  domains  through  an  intelligent  and  immersive  virtual  reality  environment  will  enable  the 
discovery  of  novel  paradigms  for  establishing  a  more  responsive  man-machine  cooperation.  Drawing  from  the  above  two 
conversations  may  help  answer  questions  that  are  fundamental  to  each.  For  artificial  intelligence  researchers,  the  question  is: 
What  would  make  an  artificially  intelligent  agent  more  aware  of  the  environment  and  its  users’  motives  and  intentions?  For 
game  designers,  the  question  is:  What  motivates  and  draws  players  to  engage  with  game  agents  to  perform  a  mutually 
beneficial  and  complex  objective? 

The  establishment  and  continuation  of  this  research  will  make  significant  impacts  on  a  variety  of  applications  in  which  human 
operators  need  to  be  in  communication  with  and  control  of  cyber-physical  systems,  when  such  systems  need  to  maintain  a 
sufficiently  high  level  of  autonomy.  This  efficient  human-robot-environment  interaction  and  its  associated  operations  become 
more  important  when  viewed  from  the  perspective  of  computational  efficacy  and  deployment  experiences  to  support  a  mission 
and  its  objectives. 

A detailed report on scientific progress and accomplishments, as well as on the utilization of the acquired equipment,
is provided in the attachment(s) below.
Foreword

The acquired equipment from this award is used to engineer, develop, and evaluate systems that
use mathematical and computational tools to help us interact efficiently with the world around
us or in our imagination. To this end, our research team investigates the use of artificial intelligence and
visual computing. Numerous fields across the human-computer interaction and gaming research areas,
from modeling to visualization, rely on the foundations within this cross-section of computing sciences.
The projects for which the equipment is utilized serve in developing a potentially unique bridge at the
intersection of two domains. On the one hand, a significant amount of research has been invested in digital
gaming and simulation to cognitively stimulate humans by computers. On the other hand, cognitive
computing scientists and roboticists are engaged in developing computational models that enable
computers and robots to understand physical and cyber environments efficiently.

We  believe  that  connecting  these  domains  through  an  intelligent  and  immersive  virtual  reality 
environment will enable the discovery of novel paradigms for establishing a more responsive man-machine
cooperation. Drawing from the above two conversations may help answer questions that are
fundamental  to  each.  For  artificial  intelligence  researchers,  the  question  is:  What  would  make  an 
artificially  intelligent  agent  more  aware  of  the  environment  and  its  users'  motives  and  intentions?  For 
game  designers,  the  question  is:  What  motivates  and  draws  players  to  engage  with  game  agents  to 
perform  a  mutually  beneficial  and  complex  objective? 

The  establishment  and  continuation  of  this  research  will  make  significant  impacts  on  a  variety  of 
applications  in  which  human  operators  need  to  be  in  communication  with  and  control  of  cyber-physical 
systems, when such systems need to maintain a sufficiently high level of autonomy. This efficient
human-robot-environment interaction and its associated operations become more important when viewed from
the perspective of computational efficacy and deployment experiences to support warfighters' missions
and their objectives.
Statement of Problem Studied


The equipment is currently being used to conduct research addressing an over-arching question:
How can we develop a framework capable of merging an artificially intelligent environment with an
immersive virtual one, in such a manner that both environments become aware of their users' motives and
intentions, while drawing the human operator to intuitively engage with the environment and its agents?
To answer this question, we are developing a cyber-physical environment capable of perceiving
both users' and robotic agents' actions from the real world and within the virtual environment, while
processing this information to act accordingly upon the virtual environment and on the real world. The
research activities and process flow between the research components in our framework for the
integration of intelligent physical environments with an immersive virtual one are depicted in Figure 1.
There are two main research motifs: an efficient computational framework for Virtual Reality and
Tele-robotics, and the associated human studies that validate the performance and efficacy of the framework.

The  first  motif  is  comprised  of  three  research  tasks,  of  which  the  first  two  (VRT-1  and  VRT-2)  are 
interconnected.  The  objective  of  these  two  tasks  is  to  help  build  reliable  models  in  order  to  facilitate 
information  depth  and  immersion  (human's  perception)  as  well  as  information  breadth  and  interactivity 
(system's  perception).  First,  there  is  the  problem  of  modeling  interactions  between  user,  environment, 
and  robot  from  data  supplied  by  a  multitude  of  sensory  devices.  To  approach  the  issues  of  breadth  of 
information  and  interactivity  (VRT-1)  we  rely  on  robotics  and  machine  learning.  On  the  other  hand,  we 
have  the  problem  of  the  human's  perception  of  the  agent's  situation  and  its  environmental  conditions. 
This  problem  is  approached  from  the  perspective  of  the  digital  gaming  field  (VRT-2). 


Figure  1.  Main  Research  Activities 


The third research task (VRT-3 in Figure 1) in support of this computational platform is to enhance the
processing power of the framework and to reduce the on-board computational load on both the robotic agents
and the virtual reality components. To achieve this objective, we are studying and developing data-parallel
algorithms for the data-parallel processes used across the system. This computation targets NVIDIA's CUDA
heterogeneous computing platform and is currently being performed on servers based on NVIDIA's Kepler
architecture. We showed performance increases of up to two orders of magnitude with the acceleration of
foreground object detection in our global visual sensory systems [12].


To improve the human user's visual and auditory stimulation, we utilize head-mounted displays, immersive
sound systems, and the Virtuix Omni, the first virtual reality interface that allows the player to move
freely and naturally in virtual environments. Our platform of choice for the implementation of the virtual
reality environment is Epic Games' Unreal Engine 4 [13]. To achieve the goals of the VRT-2 track, we
address two problems: (VRT2.1) creating an immersive experience and (VRT2.2) enhancing interactivity
with the virtual world. Our first results on integrating fine-grained finger movements acquired from Leap
Motion sensors with the body kinematic motion from our Vicon system will be discussed at the 2016 IEEE
Virtual Reality conference [14].

Summary  of  Most  Important  Results 

The  proposed  study  is  situated  at  the  intersection  of  two  conversations.  On  the  one  hand,  scholars  in 
digital  gaming  and  simulation  are  researching  the  burgeoning  world  of  video  games,  an  industry  that  has 
penetrated  two-thirds  of  United  States  households  and  now  constitutes  a  $10.5  billion  industry.  On  the 
other,  roboticists  tirelessly  engage  in  developing  computational  models  to  bring  physical  objects  to  life, 
safely  and  efficiently,  in  and  around  humans. 

Entertainment gaming researchers and engineers have given much attention to developing technological
advances aimed at drawing humans more and more into the game world, through head-mounted displays,
3D body tracking sensors, and haptic user interfaces. In a parallel and equally exciting area of research,
robotics and artificial intelligence scholars have been investigating theoretical and algorithmic
frameworks to make robots work safely and effectively in the real world, in military, industrial, medical,
and domestic applications.

The  equipment  acquired  by  this  award  is  utilized  in  a  number  of  projects  with  the  goal  of  developing  two 
environments:  a  physical,  intelligent  environment  comprised  of  unmanned  autonomous  agents  and 
multiple layers of static and dynamic sensors, and its virtual replica in which human subjects (i.e., trainees
and operators) will be immersed to tele-exist with their physical autonomous companions for training and
teleoperation purposes.

Current  Projects 

The sections below describe the current projects in which the equipment is being utilized. First, the overall
project is presented. In this project, a unified framework for the integration of heterogeneous robotic
agents within an immersive virtual reality environment is developed.

The second project discusses the acceleration of visual capabilities for the proposed framework on many-core
systems for the purpose of detecting objects of interest within the unified framework. This project
shows the process and results of such accelerations in speeding up the overall computer vision task at
hand.

The third project presents a mechanism for integrating and calibrating human hand and finger
movements within the proposed virtual reality framework. This will enable human operators to
intuitively communicate with the robots through gestures by integrating two motion capture systems,
i.e., Vicon body motion capture and Leap Motion hand motion capture.


ArVETO:  An  Aria-Based  Client-Server  Architecture  for  the  Integration  of 
Autonomous  Robotic  Platforms  and  Remote  Users  in  a  Virtual  Reality 
Environment 


In this project, an architecture is designed to integrate a number of robotic platforms in interactive
immersive virtual environments. The architecture, termed ArVETO (Aria Virtual Environment for
Tele-Operation), is an Aria-based client-server architecture that communicates directly with a state-of-the-art
game engine to utilize a virtual environment in support of tele-robotics and tele-presence.

This framework employs the Unreal Engine 4 (UE4) to provide the front-end virtual environment and user
controls,  while  utilizing  a  comprehensive  networking  architecture  to  handle  communications  between  the 
robots,  user  clients,  and  our  computational  server.  In  order  to  accelerate  data-intensive  computations  in 
support  of  such  an  interactive  and  immersive  environment,  we  utilize  the  CUDA  toolkit  and  OpenCV 
libraries  to  handle  any  calculations  needed  on  the  computational  server,  as  well  as  common  image 
processing  tasks.  The  strength  of  the  proposed  architecture  is  that  it  allows  for  the  integration  of 
heterogeneous  robotic  systems  in  an  intelligent  immersive  environment  for  intuitive  interactions 
between  the  robot  and  its  operators. 

By utilizing an immersive virtual reality medium, an operator can interact with the robot more naturally,
as buttons and joysticks can be replaced with hand gestures and interactions with the virtual environment.
This  provides  a  higher  degree  of  immersion  and  interactivity  for  the  operator  when  compared  to  more 
traditional  control  schemes. 

The  Proposed  Architecture 

This section presents the details of an integrated architecture that allows the user to control
remote robotic agents in an interactive virtual environment, while providing mechanisms for the robots
to efficiently send their sensory data back to the server and subsequently to the user, as shown in Figure 2.
The  user  controls  the  robots  in  a  virtual  environment.  This  provides  a  more  immersive  experience  for 
interacting  and  operating  remote  robots,  as  the  operator  senses  the  presence  of  the  robot  and  its 
environmental  conditions  remotely,  while  interacting  intuitively  with  the  robot. 


Figure  2.  An  overview  of  the  ArVETO  network  architecture. 


Robotic agents provide a wide range of sensors, such as sonar, laser range-finders, physical bumpers, and
stereoscopic cameras, that gather 3D information about their environment. Integrating these sensory data
into a 3D immersive and interactive virtual environment will provide much higher levels of tele-presence and
immersion for control and operation of remotely situated robots. In the proposed architecture, a
centralized computational server is utilized to mediate the communication between the Virtual
Reality (VR) client and the robot client, while performing the data-intensive computations required for the
proposed architecture and its several components.

The proposed integrated architecture, called ArVETO (Aria Virtual Environment for Tele-Operation), supports
the computations essential for tele-robotics and tele-presence, implemented within an interactive and
immersive virtual reality environment. The proposed system has three major components: virtual reality
clients, a centralized High Performance Computing (HPC) computational server, and a number of robotic
clients, each specialized to perform certain tasks. This framework allows multiple clients to interact with
multiple robots in a virtual environment, with the ultimate goal of remotely operating the agents while
allowing for high-fidelity tele-presence by the human operators. An overview of this architecture is shown
in Figure 3.

The boxes on the right represent the processes performed on the physical robotic platforms, while the
items on the left represent computations performed by the UE4 clients. The computational server
processes connect these two types of clients together. The robots stream raw sensory data to the
computational server to be stored and processed, as well as retrieved by the UE4 clients as needed. In
addition, some of the retrieved data is sent to either a UE4 listen server or a dedicated server, which uses actor
replication to stream properties of the virtual robot - such as location, orientation, and other
environmental conditions - to all of the connected UE4 clients. Afterwards, the operator can examine the
state of the virtual robot and environment and send operational commands back to the robot, through
the UE4 client's direct connection to the computational server and the server's connection to the robotic
platform.

The  benefit  of  the  ArVETO  network  architecture  is  threefold.  First,  it  provides  a  traditional  client-server 
architecture  that  minimizes  the  network  bandwidth  required  by  reducing  the  total  network  connections 
and  transactions  required  by  the  architecture.  In  addition,  this  server  can  process  data-intensive 
computations  needed  in  support  of  the  entire  system.  These  computations  must  be  performed  on  the 
raw  sensory  data  to  potentially  reduce  the  amount  of  the  data  needed  to  be  sent  to  each  UE4  client  and 
to  improve  the  accuracy  of  the  UE4  virtual  environment.  Second,  the  ArVETO  architecture  uses  UE4  actor 
replication,  to  efficiently  stream  the  robots'  properties  to  further  reduce  the  network  bandwidth.  Finally, 
we  utilize  the  concept  of  network  relevancy.  That  is,  each  UE4  client  in  the  ArVETO  architecture 
communicates  to  the  server  from  which  robot,  if  any,  it  requests  data.  This  allows  the  UE4  clients  to  cull 
robots,  either  because  they  are  out  of  focus  of  the  operator  or  because  they  are  too  far  away  from  the 
virtual  operator  to  be  of  significant  impact.  This  relevancy  mechanism  reduces  network  bandwidth  even 
further. This reduction in bandwidth is crucial, as all calculations and transactions in the ArVETO architecture
are performed in real-time.
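
The relevancy test described above can be illustrated with a short sketch. This is not the ArVETO or Unreal Engine implementation: the names `OperatorView`, `is_relevant`, and `relevant_robots`, the 2D geometry, and the specific distance and field-of-view thresholds are all assumptions made for illustration. It shows how a client might decide which robots to request data for, culling those that are too far away or outside the operator's view.

```python
import math
from dataclasses import dataclass
from typing import Dict, Set, Tuple

Vec2 = Tuple[float, float]


@dataclass(frozen=True)
class OperatorView:
    """Hypothetical description of the virtual operator's viewpoint (2D)."""
    position: Vec2
    heading_rad: float   # direction the operator is facing
    half_fov_rad: float  # half of the horizontal field of view
    max_distance: float  # robots farther away than this are culled


def is_relevant(view: OperatorView, robot_pos: Vec2) -> bool:
    """Return True if the robot is close enough and inside the field of view."""
    dx = robot_pos[0] - view.position[0]
    dy = robot_pos[1] - view.position[1]
    dist = math.hypot(dx, dy)
    if dist > view.max_distance:
        return False
    if dist == 0.0:
        return True
    bearing = math.atan2(dy, dx)
    # Smallest signed angle between the bearing and the heading, in [-pi, pi].
    delta = bearing - view.heading_rad
    diff = math.atan2(math.sin(delta), math.cos(delta))
    return abs(diff) <= view.half_fov_rad


def relevant_robots(view: OperatorView, robots: Dict[str, Vec2]) -> Set[str]:
    """IDs of the robots whose data the client would request from the server."""
    return {rid for rid, pos in robots.items() if is_relevant(view, pos)}
```

A client would send only the resulting set of IDs to the server, so that data for culled robots is never transmitted to that client.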

ArVETO is designed to support several projects aimed at creating an over-arching umbrella for immersive
virtual reality, robust telerobotics, and interactive tele-presence systems. These projects include online
3D reconstruction and procedural generation of the physical environment, robust networked connections
for the remote physical agents and virtual operators, Simultaneous Localization and Mapping (SLAM) for
each of the robotic clients, a teleoperation system based upon an efficient mapping from the users' actions
within the virtual environment to the robots' actions in the physical environment, and the accurate
presentation of the robots' actions back into the virtual environment. As such, ArVETO has been designed from
the ground up with each of these tasks in mind, resulting in a versatile and generalized architecture.

Figure 3. An overview of the proposed framework architecture.

Architecture 

As previously stated, ArVETO is comprised of three major components. These components and
their implementations are outlined in Figure 4, and each is detailed below. This figure shows
how communication between UE4 and each of the Aria clients can be achieved.


Figure 4. Allocation of work and overall communications between the Aria SDK and Unreal Engine 4 in the proposed
architecture.

Suppose  we  operate  one  robotic  client  through  a  single  UE4  client  within  the  ArVETO  architecture.  From 
the UE4 game thread, any actions performed on the environment by the user must be sent to the Aria
client thread before they can be sent to the other ArVETO network components. This Aria client is connected
to  the  computational  server  as  well  as  the  robotic  platform's  client  component.  Any  data  sent  by  the  UE4 
client  must  first  pass  through  the  computational  server,  which  can  validate  and  process  the  data  before  it 
is  sent  to  the  desired  robot  platform.  The  server  contains  a  table  of  clients,  which  can  efficiently  match  a 
client  ID  to  its  current  socket. 
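
As a rough illustration of the routing step described above, the sketch below keeps a table from client ID to connection and validates a command before forwarding it. It is written in Python for brevity and does not use the Aria SDK; `ClientTable`, `route_command`, the JSON message format, the `speed_mm_s` field, and the `MAX_SPEED_MM_S` limit are all hypothetical names and values chosen for the example.

```python
import json
from typing import Dict, Optional, Protocol


class Connection(Protocol):
    """Anything with a socket-like sendall method."""
    def sendall(self, data: bytes) -> None: ...


class ClientTable:
    """Maps a client ID to the connection that client currently uses."""

    def __init__(self) -> None:
        self._connections: Dict[str, Connection] = {}

    def register(self, client_id: str, conn: Connection) -> None:
        self._connections[client_id] = conn

    def unregister(self, client_id: str) -> None:
        self._connections.pop(client_id, None)

    def lookup(self, client_id: str) -> Optional[Connection]:
        return self._connections.get(client_id)


MAX_SPEED_MM_S = 500.0  # hypothetical safety limit used for validation


def route_command(table: ClientTable, robot_id: str, command: dict) -> bool:
    """Validate a velocity command and forward it; return True if it was sent."""
    conn = table.lookup(robot_id)
    if conn is None:
        return False  # unknown or disconnected robot
    speed = command.get("speed_mm_s")
    if not isinstance(speed, (int, float)) or abs(speed) > MAX_SPEED_MM_S:
        return False  # reject malformed or out-of-range commands
    conn.sendall(json.dumps(command).encode("utf-8") + b"\n")
    return True


class RecordingConnection:
    """Stand-in for a socket, used only to demonstrate the routing."""

    def __init__(self) -> None:
        self.sent = []

    def sendall(self, data: bytes) -> None:
        self.sent.append(data)
```

A dictionary gives constant-time lookup on average, which is one plausible way to realize the efficient ID-to-socket matching mentioned in the text.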

Finally,  any  commands  that  reach  the  desired  robot  are  executed  by  communicating  with  the  ArRobot 
component  of  the  robotic  platform.  The  robot  also  sends  data  to  UE4  in  a  similar  manner.  Most  sensory 
data are retrieved from the ArRobot component and sent to the server. However, stereo image pairs must
first  be  retrieved  from  the  FlyCapture  2  and  Triclops  pipeline  and  compressed  using  OpenCV  in  order  to 
minimize  the  needed  bandwidth. 

Results 

With  the  ArVETO  architecture  we  were  able  to  connect  the  mobile  robots  and  the  VR  client  to  the 
centralized  server  to  perform  remote  operations  and  navigational  tasks.  This  allowed  us  to  control  the 
robot  in  a  virtual  environment  as  it  moved  through  an  identical  physical  environment.  The  robot  was  able 
to  successfully  navigate  through  our  virtual  environment  while  moving  through  the  physical  world. 


Figure 5 shows the physical robot in the hallway (left image) and the virtual robot in the VR hallway
(right image). The navigation of the PatrolBot was done by an operator observing the robot's
location via the stereoscopic cameras on both the physical and virtual robot. Autonomous navigation
can be conducted by putting the robot in autonomous mode, without user intervention. The
virtual reality environment in which the robot operates is a replica of an indoor hallway with physical
objects and obstacles present. This experiment showcases the differences and similarities between the
teleoperated robot and the VR robot.


Figure 5. The robot operation with ArVETO. Left: PatrolBot in the real world. Right: the PatrolBot as observed and operated
in the Virtual World through an Oculus Rift HMD (Head-Mounted Display).


An  Efficient  Non-Parametric  Background  Modeling  Technique  with  CUDA 
Heterogeneous  Parallel  Architecture 

Foreground detection plays an important role in many content-based video processing applications. To
detect moving objects in a scene, the changes inherent to the background need to be modelled. In this
work we propose a non-parametric statistical background modeling technique. Moreover, the proposed
modeling framework is designed to utilize NVIDIA's CUDA architecture to accelerate the overall
foreground detection process. We present three main contributions: (1) a novelty detection mechanism
capable of building accurate statistical models for background pixels; (2) an adaptive mechanism for
classifying pixels based on their respective statistical background model; and (3) the complete
implementation of the proposed approach on NVIDIA's CUDA architecture. Comparisons and
both qualitative and quantitative experimental results show that the proposed work achieves
considerable accuracy in detecting foreground objects, while reaching orders-of-magnitude speed-up
compared to traditional approaches.

The objective of this research is to track multiple objects in videos with complex backgrounds, also called
quasi-stationary backgrounds [2]. There are three main approaches to developing scene-independent,
non-parametric object tracking frameworks that detect and track foreground objects while disregarding
background changes [3][4][5]. In order to track objects in real time, a spatio-spectral connected component
processing mechanism may be utilized, which employs the photometric appearances of individual objects to assign
them unique IDs [6].

This mechanism is suitable for use in a robotic application to detect potential threats by recognizing
the intent of humans present in a scene [7]. We will build on this idea to utilize robots in an environment
to recognize the intent of other agents, i.e., robots and people [8][9].

Methodology  and  Approach 

In order to accelerate foreground detection for the real-time immersive virtual reality tele-robotics
framework, we propose a novel non-parametric approach for foreground detection, capable of exploiting the
parallelism available through NVIDIA CUDA. Our proposed mechanism is comprised of three main stages and
two optional post-process modules. Figure 6 shows the workflow utilized in the proposed framework. The
update, training, and classification phases of the proposed framework in Figure 6 are represented as
the lateral sections. The horizontal sections represent stages of the proposed architecture as preprocess,
post-process or the detection stage.

Figure 6. Acceleration of the Vision Processes

The first stage is an update phase, executed on every frame to update the background information based
on the newly available data. This data is then analyzed in the training phase to produce a model for each
background pixel in the frame. The training phase is the slowest, but only needs to be executed on a
fraction of the available blocks every frame. Finally, a classification phase is executed on every frame to
determine if each pixel belongs to the background or foreground using strict and loose classification
criteria. Two post-process modules are also employed, in real-time, to refine the foreground detection
results and remove undesirable artifacts and noise.
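
The update, training, and classification phases can be sketched for a single pixel as follows. This is a sequential illustration only, not the CUDA implementation reported in [12]. It swaps in a standard Gaussian kernel density estimate over recent samples as the non-parametric model, and it assumes that the strict and loose criteria are two density thresholds that yield a three-way label; the source does not specify either detail. The class name, history length, bandwidth, and threshold values are hypothetical.

```python
import math
from collections import deque


class PixelBackgroundModel:
    """Per-pixel non-parametric background model (illustrative assumptions)."""

    def __init__(self, history: int = 50, bandwidth: float = 5.0) -> None:
        self.samples = deque(maxlen=history)  # most recent intensities
        self.bandwidth = bandwidth
        self.table = [0.0] * 256              # density per 8-bit intensity

    def update(self, intensity: int) -> None:
        """Update phase: cheap, run on every frame."""
        self.samples.append(float(intensity))

    def _density(self, value: float) -> float:
        """Gaussian kernel density estimate at the given intensity."""
        if not self.samples:
            return 0.0
        h = self.bandwidth
        norm = 1.0 / (h * math.sqrt(2.0 * math.pi))
        total = sum(math.exp(-0.5 * ((value - s) / h) ** 2) for s in self.samples)
        return norm * total / len(self.samples)

    def train(self) -> None:
        """Training phase: expensive, so it can be run only occasionally."""
        self.table = [self._density(v) for v in range(256)]

    def classify(self, intensity: int, strict: float = 1e-4, loose: float = 1e-2) -> str:
        """Classification phase: table lookup against two thresholds."""
        p = self.table[intensity]
        if p < strict:
            return "foreground"  # very unlikely under the background model
        if p < loose:
            return "uncertain"   # left for a post-process step to resolve
        return "background"
```

Because every pixel is handled independently, one model per pixel maps naturally onto one GPU thread per pixel, which is the kind of data parallelism the text refers to.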


Figure 7. Sample qualitative comparisons. From left: original frame, ground truth, foreground detected by our approach,
FTSG [16], and the Euclidean Distance method [17].


Experimental  Results 

Qualitative and quantitative comparisons have been made with other state-of-the-art foreground
detection methods listed on the www.changedetection.net website [15]. These comparisons span
11 categories, each containing four to six videos. Each video has been scored on seven
measurements against the ground truths provided by the changedetection.net website.

We compared our results to the original frame, ground truth, and two other state-of-the-art foreground
detection methods. Figure 7 shows some of these results. Each row shows, from left to right: the sequence
frame, the ground truth, our proposed method's results, the FTSG [16] method's results, and the Euclidean
Distance method's results [17].

Our proposed method has been ranked against all of the published state-of-the-art background
segmentation methods on the changedetection.net website using all available data, and the results have been
reported in [12]. These results show that our approach is comparable in quality to the traditional methods.

Most traditional methods report a processing speed below 24 fps on a 320x240 video [15], whereas our
algorithm has achieved a maximum processing speed of 1146 fps on the same 320x240 sequences using an
NVIDIA Tesla K80 cluster. The processing speeds of the top four methods based on average ranking are
shown in Figure 8. The proposed method used less than 3.5% of the device on average when processing all
of the changedetection videos, and we have achieved similar results on a commercial NVIDIA graphics card.
In comparison, the sequential version of our algorithm, using an enterprise Intel Xeon E5-2620 v3 CPU
and the same parameters, ran 38 times slower on a small-resolution (320x240) sequence and 82
times slower on a large-resolution (720x480) sequence.


Figure 8. Speed comparisons between the top 4 average ranking methods ([16], [18], [19]) and the proposed
technique (panel label: 320x240).


A  Full-Body  Motion  Calibration  and  Retargeting  for  Intuitive  Object  Manipulation 
in  Immersive  Virtual  Environments 


In  this  project  a  system  is  proposed  to  combine  small  finger  movements  with  the  large-scale  body 
movements  captured  live  from  a  motion  capture  system  or  provided  by  pre-programmed  animation  data. 
The  strength  of  the  proposed  work  over  previous  research  is  in  the  real-time  performance  of  integrating 
small-scale  finger  and  hand  animation  with  the  full-body  skeletal  animation.  This  provides  a  higher  degree 
of  immersion  and  interactivity  when  compared  to  more  traditional  virtual  reality  systems  which  use 
traditional user interfaces. A number of experiments are conducted with humanoid skeletons that are
both similar to an actual human (e.g., a SWAT officer) and dissimilar (e.g., a Gremlin), to
showcase the performance of the proposed approach.

Methodology  and  Approach 

Vicon  Pegasus  software  is  an  excellent  starting  point  for  the  reproduction  of  skeletal  movements  in  virtual 
reality  environments.  Pegasus  handles  the  retargeting  and  streaming  of  skeletal  bodies  into  virtual 
environments,  and  is  supported  by  several  state-of-the-art  game  engines  such  as  Unity  and  Unreal  Engine 
4.  Using  software  such  as  Vicon  Blade  or  Tracker  combined  with  Pegasus  offers  a  level  of  accuracy  that  is 
unrivaled  when  compared  to  similar  marker-less  setups. 

However, these kinds of setups have difficulty in gathering data about joints that are too close in
proximity, particularly the fingers. Therefore, marker-based setups are unequipped to handle tasks such as
gesture detection without including additional sensors. With the addition of a Leap Motion Controller, the
data specific to each finger can be blended with the Pegasus body pose to fully recreate a skeletal body in
a virtual environment. This Leap Motion Controller can be mounted either on a desk for sitting motions
or directly on the HMD for standing motions, in order to capture hand motion for all gestures that are
performed in front of the user.

In order to include the Leap Motion data in an existing pose, such as one provided live from a Vicon
motion capture system via Pegasus or from pre-animated sequences, the data will need to be retargeted
to match the hand. Because the user's hand dimensions are not known in advance, a one-to-one
retargeting of each bone's orientation is impractical. Instead, an Inverse Kinematics (IK) solver should be
employed to guarantee that each fingertip position in the virtual skeleton is consistent with the Leap
Motion data. To better match the Leap Motion data to the skeletal body, a calibration mode will be
included to approximate the maximum and minimum distance from the base of each finger to its tip. These
measurements will allow the Leap Motion data to be normalized and used to better match the finger
lengths of the virtual skeleton. Implementation details are presented in [14].
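
The calibration and normalization step can be sketched as follows. This is a minimal illustration and not the implementation from [14]: the names `FingerRange`, `calibrate`, `normalize`, and `retarget_tip_distance`, the sample distances, and the linear mapping are all assumptions made for the example. The sketch records the minimum and maximum base-to-tip distance of one finger during calibration, normalizes a live measurement to the range [0, 1], and rescales it to the corresponding range of the virtual skeleton's finger.

```python
from dataclasses import dataclass


@dataclass
class FingerRange:
    """Minimum and maximum base-to-tip distance of one finger (any length unit)."""
    min_dist: float
    max_dist: float


def calibrate(samples):
    """Build a FingerRange from base-to-tip distances recorded during calibration."""
    if not samples:
        raise ValueError("calibration needs at least one sample")
    return FingerRange(min(samples), max(samples))


def normalize(dist, user):
    """Map a measured distance to [0, 1] within the user's calibrated range."""
    span = user.max_dist - user.min_dist
    if span <= 0.0:
        return 0.0  # degenerate calibration: treat the finger as fully curled
    t = (dist - user.min_dist) / span
    return max(0.0, min(1.0, t))  # clamp noise that falls outside the range


def retarget_tip_distance(dist, user, skeleton):
    """Rescale the user's measurement to the virtual skeleton's finger range."""
    t = normalize(dist, user)
    return skeleton.min_dist + t * (skeleton.max_dist - skeleton.min_dist)


# Hypothetical values: a user's index finger and a differently sized virtual hand.
user_index = calibrate([31.0, 52.5, 78.0, 64.2])
skeleton_index = FingerRange(min_dist=4.0, max_dist=12.0)
target = retarget_tip_distance(54.5, user_index, skeleton_index)
```

In a setup like the one described, the rescaled distance would serve as the fingertip target handed to the IK solver, so the solver rather than a direct bone-by-bone copy determines the joint rotations.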

Our platform of choice for the implementation of the virtual reality environment is Epic Games' Unreal
Engine 4. The game engine, developed by Epic Games Inc., comprises an advanced graphics rendering
engine, a sound engine, and physics and animation engines. This game engine is capable of delivering
unparalleled performance in 3D realistic gameplay, simulation and visualization [13]. Unreal Engine 4
(UE4), the latest major version of the engine, was released in April 2014. New to this release are several
completely redesigned architectures that we are planning to utilize in this research.

Figure 9 shows a range of qualitative results generated using two different skeletons. For each skeleton,
we captured a rendering of the final pose for four different hand gestures without the Leap Motion (left),
with the un-calibrated Leap Motion (middle), and with the calibrated Leap Motion (right). The large
difference between the un-calibrated and calibrated results reflects the need for the user-specific
calibration data. With a one-to-one mapping from the user's hand to the skeleton's hand, a skeleton made
specifically for the user's hand dimensions would be required. With the calibration process, however,
any user's hand can be retargeted to any skeletal hand for fine motor control and gesture detection.


[Figure 9 panels: (a) Close hand gesture. (b) Open hand gesture. (d) V-shaped hand gesture.]


Figure  9.  Sample  qualitative  results  achieved  by  the  proposed  approach  using  a  humanoid  skeleton  with  both  similar  hand 
proportions  (left)  and  dissimilar  hand  proportions  (right)  when  compared  to  the  user.  Images  in  each  group  from  left:  No  leap 
motion  data,  un-calibrated  leap  motion  data,  and  calibrated  leap  motion  data. 


Additional  Projects  of  Interest 

Interactivity  Related  Research  Projects 

A significant research question that we are currently investigating is how to improve the interactivity
and breadth of information provided in our virtual reality applications for seamless tele-operation
of robotic agents. To that end, we need reliable and robust means of carrying the human user's
actions, with important and tangible effects, into the virtual environment, and from there into
the physically tele-operated environment (i.e., the real world where robotic agents operate).

To  approach  this  problem,  our  team  is  mainly  focusing  on  two  research  questions.  First,  we  will  approach 
the  fusion  of  sensory  information  to  design  accurate  models  for  the  control  of  a  remote  agent  by 
retargeting  human  gestures  (or  body  part  movements)  on  the  control  structure  of  the  remote  agent.  Next, 
we  will  investigate  and  design  higher-order  models  of  the  user's  activity  (and  intent)  by  utilizing  the 
theory  of  mind. 

The robots will first be trained by observing retargeted movement patterns of the operator. Once
a structured activity (such as opening a door or fastening a screw) has been learned by the robot, the
robot will be able to take the perspective of its human operators when it later encounters similar
activities and infer their intent. This will further facilitate the interactivity and responsiveness of the
robotic agents in support of the proposed framework.

Sensor  Fusion  for  Tele-operation 

In this phase, we will investigate important factors in designing efficient modes for the tele-operation of
remote agents from data supplied by our different sensory systems. Each of these sensors comes with a
different level of accuracy. Therefore, we will first establish ground-truth sequences of sensory
information based on the most accurate sensory system in our portfolio.


Figure 10. Motion capture tracking data from the Vicon T-160 camera system for a normal gait: (a) 3D
reconstruction of the actor's movements; (b) joint-bone 3D rotational data for the ankle.


We have acquired over 600 hours of motion capture sequences for a large number of poses and activities,
such as regular walk, run, and jump cycles for both males and females. Figure 10 shows a 3D
reconstruction of an actor's normal gait pattern captured by the Vicon T-160 camera system. The
sub-millimeter accuracy of this camera system and its independence with respect to the global
coordinate system make it a strong candidate to serve as the reference for calibration.

One important aspect of developing an affordable and tailorable virtual reality environment is the range
of costs across its different types of sensors. On the one hand, the Vicon T-160 motion capture cameras
provide very accurate motion information, but at a very high cost, both in terms of budget and the
expertise required to operate such a system. On the other hand, very low-cost visual/range sensors such as
Kinect cameras provide motion data at a lower cost but with less accuracy. We propose to utilize the
various sensory systems within this project to build models of higher-order information, such as activities,
intents, and overall scene models, across different types of sensors.

To build the higher-order models, we leverage the baseline positional data to validate the accuracy of
the other sensory systems and their applicability and efficiency for data collection. This part of our
studies will closely resemble the studies we have planned to conduct in support of the
immersion and depth of information components.
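
One simple way to validate a sensor against the baseline positional data is a root-mean-square error (RMSE) over a joint trajectory. The sketch below is only an illustration under stated assumptions: it assumes the two trajectories are already time-synchronized and expressed in the same coordinate frame, and the function name `rmse` and the sample positions are hypothetical. RMSE is one common accuracy measure and is not necessarily the metric used in our studies.

```python
import math


def rmse(reference, measured):
    """Root-mean-square Euclidean error between two equal-length 3D trajectories."""
    if len(reference) != len(measured) or not reference:
        raise ValueError("trajectories must be non-empty and the same length")
    total = 0.0
    for (rx, ry, rz), (mx, my, mz) in zip(reference, measured):
        # Squared Euclidean distance between the two positions in this frame.
        total += (rx - mx) ** 2 + (ry - my) ** 2 + (rz - mz) ** 2
    return math.sqrt(total / len(reference))


# Hypothetical ankle positions in metres: reference system vs. a low-cost sensor.
reference_ankle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
low_cost_ankle = [(0.03, 0.04, 0.0), (1.0, 0.0, 0.0)]
error = rmse(reference_ankle, low_cost_ankle)
```

A per-joint error of this kind could then be compared across sensors to judge which ones are accurate enough for a given tele-operation task.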

By establishing higher-order models of interaction across a variety of sensory systems, we can integrate
on-board sensors from the robots (i.e., visual and range sensors), the virtual reality wearable sensors (i.e.,
the data gloves and head-mounted display), and the VR motion interfaces (i.e., feet and pose sensors)
with the global sensors (i.e., the T-160 optical data and the Bonita video references). In this stage, models
of interaction between the operator and the world will be built based on the accuracy and applicability of
each sensory system to supply the tele-operation component.
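
As an illustration of how sensor accuracy could enter such a model, the sketch below combines two estimates of the same quantity with inverse-variance weighting. This is a standard fusion technique chosen for the example and is not named in this report; the function name `fuse`, the joint-angle values, and the variances are hypothetical.

```python
def fuse(estimates):
    """Inverse-variance weighted average of (value, variance) pairs.

    Returns the fused value and the variance of the fused estimate.
    """
    if not estimates:
        raise ValueError("need at least one estimate")
    weights = []
    for _, variance in estimates:
        if variance <= 0.0:
            raise ValueError("variance must be positive")
        weights.append(1.0 / variance)  # more accurate sensors get larger weights
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    return value, 1.0 / total


# Hypothetical joint angle (degrees) seen by an accurate and a less accurate sensor.
optical = (90.0, 1.0)
depth = (96.0, 4.0)
angle, variance = fuse([optical, depth])
```

The fused estimate stays close to the more accurate sensor while still using the less accurate one, and its variance is lower than that of either input.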

Once we have suitable models of interaction designed and validated for each sensory system in the final
framework, we will look at translating these models of interaction into suitable control mechanisms for
tele-operation of the robotic agents. This phase of our research will be platform dependent, as different
robots will have different modes of operation.

To  control  a  humanoid  robot  with  a  skeletal  structure  similar  to  that  of  its  operator,  we  will  investigate 
different  mechanisms  to  create  robust  retargeting  functions  (similar  to  those  used  in  retargeting 
animation  between  different  3D  models)  to  transfer  the  kinematic  data  from  global  and/or  local  (and 
wearable)  sensors  to  the  skeletal  structure  of  the  controlled  robot.  For  non-humanoid  robots,  the 
retargeting  models  will  be  designed  to  suit  the  robot's  actual  mode  of  operation  to  control  its  motors  by 
human  kinetics. 
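
A very small sketch of a retargeting function for a humanoid robot is given below. It assumes the operator's pose is available as named joint angles in degrees and that each mapped robot joint has a known angular range. The joint names, the limits, and the function `retarget_pose` are hypothetical and do not reflect the specification or API of any particular robot.

```python
def retarget_pose(operator_angles, joint_map, limits):
    """Map operator joint angles onto robot joints, clamped to the robot's limits."""
    commands = {}
    for operator_joint, robot_joint in joint_map.items():
        if operator_joint not in operator_angles:
            continue  # the sensors did not report this joint in the current frame
        low, high = limits[robot_joint]
        angle = operator_angles[operator_joint]
        commands[robot_joint] = max(low, min(high, angle))
    return commands


# Hypothetical mapping and limits (degrees) for a two-joint arm.
JOINT_MAP = {"right_elbow": "RElbow", "right_shoulder_pitch": "RShoulderPitch"}
LIMITS = {"RElbow": (0.0, 90.0), "RShoulderPitch": (-120.0, 120.0)}

pose = {"right_elbow": 130.0, "right_shoulder_pitch": -45.0, "right_wrist": 10.0}
commands = retarget_pose(pose, JOINT_MAP, LIMITS)
```

Operator joints without a counterpart on the robot (the wrist in this example) are simply dropped, and angles outside the robot's range are clamped rather than passed through.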

Situational Awareness

An important aspect of our proposed architecture is its capability to interact with human users and
operators in an intuitive, robust and reliable manner. To this end, the proposed architecture needs to
make intensive use of high-level intelligent tasks such as inferring user intentions, understanding contexts,
and learning. Most of these tasks require the system to process information represented
in the visual domain. Visual and range information may be combined to deliver more reliable patterns for
processing by robots in our integrated system. Therefore, this phase will greatly benefit from a
successful sensor fusion track.

A reliable and robust intelligent environment that is designed to co-operate with human inhabitants will
need to possess, on some level, a theory of mind [20]. This will enable the system to perceive human
operators' actions and, based on prior experiences, to predict the intentions that the
operators or cohabitants likely had in mind. We will statistically evaluate different cognitive models for
efficient control of our robotic agents. This will provide prior information to augment our models of
interaction and to facilitate the implementation and development of the proposed theory of mind for intent
recognition. Prior information has been used successfully to augment such models by Kelley et
al. [8], while hidden Markov models (HMMs) are used to infer high-level cognitive functions such as intent
recognition in the visual domain [9]. Moreover, we plan to enhance these models by bringing in more
data-rich sensory information and by developing data-parallel algorithms to offload the processing of
this data to a GPU-accelerated platform.
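
The following sketch illustrates one way intent recognition of this kind can work. It assumes that each candidate intent is represented by a discrete hidden Markov model (HMM); that is a common choice for intent recognition and an assumption here, not a detail confirmed by this report. The forward algorithm scores an observed sequence under each model and the highest-scoring intent is selected. The intent names, the model parameters, and the observation symbols (0 and 1, standing for quantized motion features) are hypothetical.

```python
def forward_likelihood(observations, start, trans, emit):
    """Probability of an observation sequence under a discrete HMM (forward algorithm)."""
    if not observations:
        raise ValueError("need at least one observation")
    n_states = len(start)
    # Initialization: probability of starting in each state and emitting the first symbol.
    alpha = [start[s] * emit[s][observations[0]] for s in range(n_states)]
    # Recursion: propagate through the transition matrix, then emit the next symbol.
    for obs in observations[1:]:
        alpha = [
            emit[s][obs] * sum(alpha[p] * trans[p][s] for p in range(n_states))
            for s in range(n_states)
        ]
    return sum(alpha)


def most_likely_intent(observations, models):
    """Return the intent whose model assigns the sequence the highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(observations, *models[name]))


# Hypothetical two-state models: (start, transition, emission) for each intent.
TRANS = [[0.5, 0.5], [0.0, 1.0]]
MODELS = {
    "open_door": ([1.0, 0.0], TRANS, [[0.9, 0.1], [0.2, 0.8]]),
    "fasten_screw": ([1.0, 0.0], TRANS, [[0.1, 0.9], [0.8, 0.2]]),
}

sequence = [0, 0, 1]
intent = most_likely_intent(sequence, MODELS)
```

Because every state's update in the recursion is independent of the others, this computation is a natural candidate for the data-parallel treatment mentioned above when the number of states, models, or sequences is large.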
Appendixes


Appendix  A  -  Equipment  and  Budget  Description 


A.  Senior/Key  Personnel  Salary  &  Wage 

None  requested. 


B.  Other  Personnel  Salary  &  Wage 

During  the  summer  of  2015  the  PI  submitted  a  proposal  to  host  one  high  school  student  and  one 
undergraduate  student  at  the  CAVE  lab  at  UHV  to  work  with  the  equipment,  on  two  research 
projects  relevant  to  the  research  activities  of  the  funded  proposal.  The  funding  for  these  students 
was  supported  by  HSAP  and  URAP  programs. 

Student  employees  [$5,300.32] 


C.  Equipment 

This  proposal  requested  the  funding  for  a  unified  tele-operations  system  in  an  immersive  virtual 
reality  environment  to  enable  research  in  sensory  data  fusion  and  processing,  heterogeneous 
computational  architecture  in  support  of  such  processes,  and  human  studies  of  interaction  and 
trust.  Three  components  are  essential  to  establish  this  research  capability.  Each  one  of  the 
following  components  is  integral  to  the  research  agenda  and  additional  educational  utilization. 
Together,  Components  1-3,  including  modules  A  and  B  of  Component  1,  comprise  a  single 
system.  On  its  own,  an  individual  component  would  lack  utility  value. 


System  Component  1:  Virtual  Reality  Environment 

The  Virtual  Reality  Environment  will  consist  of  two  cooperative  modules,  a  video  capture  module 
(A)  and  virtual  reality  client  module  (B). 

Module  A  -  Vicon  Bonita  Video  Capture  Module 

The purpose of this module is to provide an accurate and robust sensory system in the visual
domain. The research team is utilizing this component of the overall system to calibrate visual
and non-visual tracking data acquired from T-160, Bonita, Virtual Glove, and other sensors to
establish and validate models of activities, intentions, and tracking. This module consists of the
following:


Four  Bonita  720c  Video  Cameras  ($10,152.22  total)  will  be  necessary  to  generate  models  for  the 
visual  domain  from  motion  data  captured  by  UHV's  existing  T-160  system.  They  are  also 
important  in  validation,  to  test  for  accuracy  by  cross-analysis  with  the  T-160  motion  data.  The 
cameras  operate  with  two  Bogen  super  clamp  short  studs  at  $45  each  ($90  total);  four  microball 
heads  at  $75  each  ($300  total);  and  two  light  duty  tripods  at  $225  each  ($450  total)  for  mounting. 
[$10,992.22] 

One  Giganet  LAB  unit  of  $11,052.82  connects  the  T-160  optical  cameras  and  the  Bonita  cameras 
to  the  host  capture  PC,  essentially  synchronizing  the  system  via  the  Bonita  to  Giganet  Cable  and 
six-foot  CAT5E  cable  (included  in  cost).  [$11,052.82] 

One  Vicon  Active  Wand  of  $818.73  is  necessary  to  calibrate  the  optical  and  reference  cameras 
with  5-point  accuracy.  [$818.73] 

One Quad Video PC of $2,922.86 is essential to handle processing for the visual tracking system and
to synchronize the Bonita video camera feed with the T-160 optical motion capture feed. The PC
also requires two Dell 24-inch monitors ($532.17 total). [$3,455.03]

Vicon  Nexus  Software  for  this  PC  is  included  in  Other  Direct  Costs,  number  4,  ADP/Computer 
Services.  Installation  and  training  costs  from  Vicon  are  included  in  Other  Direct  Costs,  number  3, 
Consultant  Services. 

SUBTOTAL:  $26,318.80 


Module  B  -  Virtual  Reality  Client  Module 

The  purpose  of  this  module  is  to  facilitate  the  immersion  of  human  users  and  operators  into  the 
virtual  reality  environment,  and  acquire  motion  patterns  to  enable  interactions  between  virtual 
components  of  the  system  and  virtual  environment.  This  module  consists  of  the  following: 

Two  Alienware  Aurora  R4  Desktop  computers  ($7,339.96  total)  are  used  exclusively  to  ensure 
that  graphics  and  visualization  processes  run  efficiently  via  their  NVidia  Graphics  Processing 
Units.  The  desktops  are  needed  to  supplement  the  computers  in  the  CAVE  lab  for  general  use  by 
devoting  these  two  computers  to  usage  related  especially  to  the  research  activities  described  in 
the  technical  portion  of  this  report  and,  as  applicable,  to  educational  utilization.  The  set  includes 
one  of  each  component,  such  as  the  monitor,  processor,  keyboard,  operating  system,  resource 
DVD,  CD/DVD  drive,  warranties,  and  standard  software,  inclusive  in  the  quoted  price.  [$7,339.96] 

Two VirtualGlove Data Gloves (1 pair) at $5,000 per glove ($10,000 total) are an important tool for
achieving precision in tracking minute movements of users' hands and fingers and transmitting
this information. A shipping cost of $100 is included. [$10,100]


SUBTOTAL:  $17,439.96 


System  Component  2:  Computational  Backbone 


The  purpose  of  this  component  is  to  support  the  computational  needs  for  the  machine  learning 
and  pattern  recognition  algorithms  for  building  and  evaluation  of  models  of  activities,  intents  and 
emerging  interactions  between  virtual  and  physical  worlds.  This  component  consists  of  the 
following: 

The Mercury GPU 408 Tower Server, pretested with two Tesla K80 Graphics Processing Units
(GPUs) and priced at $15,513.88, is crucial for processing the high volume of data
generated in the capture environment and for building models for the autonomous agents. This tool
will allow us to synchronize the robots with the virtual reality system as it performs modeling,
training, and simulation processes. It consists of a tower, motherboard, keyboard, speakers, and
3-year warranty, among the items listed in the quotation. Important components include two
Intel Xeon E5-2620 (6-core, 2.40GHz) processors and two NVidia Tesla K80 24GB PCI Express
(PCIe) GPUs. [$15,513.88]

SUBTOTAL:  $15,513.88 


System  Component  3:  Autonomous  Operational  Robots 

This component is the physical component of the system which directly interacts with humans.
As such, each of the proposed robots plays a major role in enabling our team to study
various means of co-operation and tele-operation needed for this project. It consists of the
following:

One PeopleBot at $31,995, along with its digital stereo camera ($9,995) and serial tether ($43), is
central to our human studies. It will be used to map the environment and communicate with
human operators. The autonomous capabilities of this robot, the ability to fully program it in
both autonomous and tele-operated modes, and its intuitive interfaces for communicating with
humans will allow us to perform complex tasks and human studies to validate human-machine
trust and performance in complex scenarios. Shipping costs are $500. [$42,533]

One PatrolBot at $44,995 is a reliable research robot and comes with an arm ($14,995) and
gripper ($350). It will be used for tasks such as calibrating the motion capture facility,
tele-operation tasks, and autonomous tasks within the environment. Without this robot we could not
perform complex tasks which require a functional robotic arm with the required degrees of
freedom. Shipping costs are $500. [$60,840]

One  NAO  at  $8,200  is  a  humanoid  robot  that  will  be  used  both  for  trust  studies  and  tele-operation 
tasks.  The  anthropomorphic  shape  and  feel  of  this  robot  and  the  25  DoF  afforded  by  NAO  will  be 
an  instrumental  component  for  human  studies  and  validation  of  human-robot  trust  models. 
[$8,200] 

SUBTOTAL:  $111,573.00 


The four systems are essential to completely outfit UHV and its students with the capability to
carry out the proposed research. In sum, the proposed system comprises a set of
interconnected components, namely: the virtual reality environment (component 1), the
computational backbone (component 2), and autonomous robotics agents (component 3). The
computational backbone will interact directly with the other two components and will be in
charge of computational modeling of calibration, machine learning, and sensory data integration
as well as facilitating the interactions between physical and virtual components of the system.
The Virtual Reality modules of the system will locally process the sensory data from the client side
and from the environment through the capture component. Finally, the robotics agents will be the
physical component of the system, which will interface with the other two components to
facilitate human-robot interaction and help with our studies of human-robot trust as well as
the performance of such interactions.

TOTAL  EQUIPMENT  COSTS:  $170,845.64 


D.  Travel 

None  requested. 


E.  Participant/Trainee  Support  Costs 

None  requested. 


F.  Other  Direct  Costs 


1.  Materials  and  Supplies 

One Alienware 17 (210-ACKC) Laptop of $1,812.99 is requested for remote applications, along with an
external hard drive of $69.99. Providing demonstrations and presenting findings is essential to the
outreach and dissemination components of any research, and Dr. Tavakkoli intends to do both at high
schools and academic conferences. The laptop will be used in tandem with the requested equipment only.
[$1,882.98]

For  our  remote  applications  Alienware  17  (210-ACKC)  laptop,  we  request  a  travel  briefcase  of 
$99.99,  already  included  in  the  quotation.  The  Vindicator  briefcase  is  made  specially  to  fit  this 
laptop  and  provides  a  high  quality  of  protection.  [$99.99] 

SUBTOTAL:  $1,982.97 


2.  Publication  Costs 


None  requested. 

3.  Consultant  Services 


Installation and training for the Vicon Bonita Video Capture system require a Vicon
engineer to spend two days at UHV to install the system and train Dr. Tavakkoli in its use. [$0]

SUBTOTAL:  $0.00 


4.  ADP/Computer  Services 

The  Vicon  Nexus  Software  of  $15,871.20  will  be  operated  on  the  Quad  Video  PC  which  allows  for 
system  calibration,  data  capture,  processing,  and  exporting  to  other  software.  [$15,871.20] 

SUBTOTAL:  $15,871.20 


5.  Special  Circumstances 

Originally, a 4DoF gripper for the robotic arms (quoted at $2,995) was requested. After
the grant was awarded, the vendor and manufacturer informed the PI that the 4DoF gripper was
going through a redesign phase. The redesign was not completed by the end of the
performance period. The PI contacted the program officer and received approval to acquire a
2DoF gripper (quoted at $360). The PI also received permission to use the remainder of the funds
originally allocated for the 4DoF gripper to purchase an additional Alienware Area 51 desktop
($2,249.99) for the Virtual Reality client module. [$2,249.99]

SUBTOTAL:  $2,249.99 

TOTAL OTHER DIRECT COSTS: $20,104.16


G.  Direct  Costs  (Total) 
$196,250.12 


H.  Indirect  Costs 

None  requested. 


I.  Total  Direct  and  Indirect  Costs  (Total  Federal  Request) 


$196,250.12 


Appendix B - Final Budget

Equipment

Description                                                       Manufacturer/Vendor          Costs  Acquisition Special Circumstance

Vicon Bonita Video Capture & Virtual Reality Client Modules
Bonita 720c Video Cameras                                         Vicon                    10,152.22
Bogen super clamp short studs                                     Vicon                        90.00
Microball heads                                                   Vicon                       300.00
Light duty tripods                                                Vicon                       450.00
Giganet LAB unit                                                  Vicon                    11,052.82
Vicon Active Wand                                                 Vicon                       818.73
Quad Video PC                                                     DELL                      2,922.86
Dell 24-inch monitors                                             DELL/Vicon                  532.17
Alienware Aurora R4 Desktop computers                             Dell Marketing Lp         6,859.98
Dell UltraSharp 24-inch monitors                                  Dell/Dell Marketing LP      479.98
VirtualGlove RH                                                   Virtual Realities         5,000.00
VirtualGlove LH                                                   Virtual Realities         5,000.00
Shipping & handling                                               N/A                         100.00

Computational Backbone (Quantum Server, Processors, GPUs)
Mercury GPU408 4U Tower Server with Three-Year Standard Warranty  Advanced Hpc Inc         14,569.56
Mouse & Keyboard                                                  Advanced Hpc Inc             28.41
UltraSharp 24 Inch VIS monitor                                    Advanced Hpc Inc            261.36
Microsoft Windows 7 Professional                                  Advanced Hpc Inc            153.41
4T SATA 6Gb/s 3.5 Inch 7.2 RPM Disk Drive                         Advanced Hpc Inc            251.14
Shipping & handling                                               N/A                         250.00

Autonomous Robotics Agents (PeopleBot, PatrolBot, NAO)
PeopleBot                                                         Adept Mobile Robots      31,995.00
Digital stereo camera                                             Adept Mobile Robots       9,995.00
Serial tether                                                     Adept Mobile Robots          43.00
Shipping & handling                                               Adept Mobile Robots         500.00
PatrolBot (VSLAM PatrolBot w/Color Stereo)                        Adept Mobile Robots      44,995.00
Arm (Premium Arm (DX/AT))                                         Adept Mobile Robots      14,995.00
Gripper (4 DOF Gripper for Gamma 1500)                            Adept Mobile Robots         350.00
Shipping & handling (freight)                                     Adept Mobile Robots         500.00
NAO                                                               RobotsLAB US Inc.         7,990.00
Shipping & handling                                               N/A                         354.00
Discount                                                          N/A                        -144.00

Materials and Supplies
Alienware 17 (210-ACKC) Laptop                                    Dell Marketing Lp         1,812.99
External DVD                                                      Dell Marketing Lp            69.99
Travel briefcase (Alienware Vindicator Briefcase)                 Dell Marketing Lp            99.99

Consultant Services
Installation & training for the Vicon Bonita Video Capture        Vicon                         0.00

ADP/Computer Services
Vicon Nexus 2.0 Standalone                                        Vicon                    15,871.20

Special Circumstances
Alienware Area 51 Desktop                                         Dell Marketing Lp         2,249.99  Please see section F.5 on page 23

Equipment Total                                                                          $190,949.80

HSAP and URAP Summer 2015 Apprenticeships

Student employees       Salary & Wages    Fringe      Totals
Melissa Clark (URAP)    $2,360.00         $192.81     $2,552.81
Lucas Kabela (HSAP)     $2,540.00         $207.52     $2,747.52
Total                                                 $5,300.33


Appendix  C  -  Evidence  of  UHV's  Additional  Support 


Evidence  of  UHV's  Current/Future  Financial  Support  of  Proposed  Research  &  Education 

The  following  items  are  on  order  or  are  scheduled  to  be  ordered  using  HEAF  funds  and  are 
scheduled  for  installation  and  usage  for  Fall  2014.  They  can  be  used  within  UHV's  existing  CAVE 
lab  and  its  current  system.  Other  technologies  and  equipment  currently  in  place  at  UHV's  CAVE 
lab are named in the Project Narrative. These costs do not include all potential maintenance
and operation costs or other costs that may be requested from HEAF funds in the coming
years, only those for which dedicated funding already is arranged.

Two Virtuix Omni VR treadmills (omni-directional motion interfaces) at $500 each ($1,000
total) will allow users to move freely and naturally in virtual environments, thereby enabling
us to conduct human studies in trust, comfort, and performance with the equipment. [$1,000]

Two  Low-latency  Oculus  Rift  v2  goggles  (head-mounted  displays)  of  $350  each  ($700  total)  are 
integrated  within  UHV's  existing  Unreal  4.x  game  engine  licenses  and  subscriptions.  For  the 
purposes  of  this  research,  they  are  necessary  to  immerse  the  user  into  the  virtual  world 
without  requiring  the  development  of  an  Application  Programming  Interface.  [$700] 

TOTAL  INSTITUTIONAL  COMMITMENT:  $1,700.00 


